Abstract

This paper presents a novel approach to network coding for distribution of large files. Instead of the
usual approach of splitting packets into disjoint classes (also known as generations) we propose the use
of overlapping classes. The overlapping allows the decoder to alternate between Gaussian elimination
and back substitution, simultaneously boosting the performance and reducing the decoding complexity.
Our approach can be seen as a combination of fountain coding and network coding. Simulation results
are presented that demonstrate the promise of our approach.
I. Introduction



Network coding [1]-[5] is a promising approach to data dissemination over networks. In the past
decade, several works have attempted to establish the potential of this simple and yet seemingly
revolutionary idea in a variety of applications [6]-[11]. While the success of network coding for streaming
media and wireless applications has been encouraging, it is still unclear whether this approach is beneficial
for peer-to-peer file dissemination [12]. The present paper is an initial attempt to fill this gap.

One major issue is decoding complexity. In the file-downloading scenario, a large file of $km \log_2 q$
bits is to be distributed among cooperating peers in a network. The file is partitioned into $k$ packets, each
consisting of $m$ symbols over a finite field $\mathbb{F}_q$. If random linear network coding [4] is used to distribute
the file, then each receiver has to solve a linear system with $k$ equations in order to decode the file. This
requires $O(k^3 + k^2 m)$ operations in $\mathbb{F}_q$, which may be prohibitively expensive in practice.

Supported by CAPES Foundation, Brazil.
To reduce the decoding complexity, Chou et al. [5] proposed to group packets into disjoint generations,
each containing d packets, and apply network coding only within each generation. The complexity issue 
is solved if d is small, but another problem is created: that of efficiently routing L = k/d generations 
throughout the network. 

Note that simply choosing a small $k$ and compensating the file size by using a large packet length $m$,
as done in [7], may not be a satisfactory solution. Transmitting such large packets (of, say, 1-4 MBytes
[7]) over a dynamic peer-to-peer network, where peers may interrupt transmissions or leave the network
at any time, is a highly nontrivial problem. Since each coded packet is essentially unique, interrupted
transmissions are useless to a receiving peer, potentially causing a severe waste of bandwidth. Thus, we
find it more realistic to assume that $m$ is small and $L$ is large.

Probably the most successful approach so far to routing pieces of a file through a peer-to-peer network 
is the BitTorrent protocol [13]. The drawback of this and similar protocols is that a large number of 
control messages must be exchanged between peers, mainly to resolve the problems of rare blocks and 
block reconciliation [14], [15]. Thus, the protocol overhead is substantial, and a significant amount of 
research has been devoted to trying to alleviate this problem [14]. 

The solution proposed by Maymounkov et al. [16], in the context of generation-based network coding,
completely eliminates any protocol overhead: peers randomly choose the generation from which to
transmit each packet. This scheme is called chunked coding. Intuitively, the scheme replaces protocol overhead
with transmission overhead. While the scheme is shown to have a good performance asymptotically, the
performance quickly deteriorates for practical values of $d$.

A related line of work is fountain coding [17]. By using optimized degree distributions, fountain codes 
such as LT or raptor codes can achieve a relatively small overhead with a low-complexity back-substitution 
decoder [17]. These schemes, however, are not compatible with network coding. To maintain the designed 
degree distributions, packets must travel intact throughout the network — otherwise, the decoder would 
fail miserably. 

This paper investigates the following question: is it possible to use a true network coding approach 
and yet enjoy a low-complexity fountain-like decoder? The approach proposed here answers this question
affirmatively, and can be seen as a combination of fountain coding and network coding. Our idea is to 
follow the approach of chunked coding, but instead use a larger number of overlapping generations (here 
called classes). Overlapping generations allow packets from decoded generations to be back-substituted 
into still undecoded generations, in the same spirit of a fountain decoder. This not only boosts the 
performance but also reduces the decoding complexity of the scheme. 
The remainder of the paper is organized as follows. In Section II we review previous work on network
coding in a way that simplifies the description of our codes and emphasizes the existing connections.
Section III presents our approach, including the description of the decoder and bounds on the decoding
complexity. In Section IV we present some code constructions, whose performance is evaluated in
Section V and compared with that of chunked codes. Finally, Section VI presents some concluding
remarks.

II. Preliminaries 

A. Random Linear Network Coding 

Consider a communication network represented by a directed multigraph (cyclic or acyclic). The
network is used to transport $k$ data (or uncoded) packets $u_1, \ldots, u_k$ from a single source node to multiple
destination nodes. Packets are regarded as vectors of length $m$ over a finite field $\mathbb{F}_q$. Each edge in the
network is assumed to transport a single packet, free of errors. To describe the operation of the network,
we associate with each edge $e$ a tuple $(P_e, t_e^-, t_e^+)$; if $e$ is an edge from a node $v^-$ to a node $v^+$, then
this tuple indicates that packet $P_e$ was transmitted by $v^-$ at time $t_e^-$ and was received by $v^+$ at time $t_e^+$.
We may also say that $P_e$ is an outgoing packet of $v^-$ and an incoming packet of $v^+$. For consistency,
we assume that the data packets $u_i$ were received by the source node at time $-\infty$.

The computation performed at the nodes must satisfy the law of (causal) information flow: a packet 
transmitted by a node must be computed as a function of packets previously received by that node. A 
(causal) schedule for a network is a specification of all the time values $t_e^-$, $t_e^+$ satisfying this constraint.

Given a network and a schedule, a network code is the specification of all functions computed at all
nodes. In a linear network code [2], [3], all such functions are constrained to be $\mathbb{F}_q$-linear combinations.
This implies that any packet $P_e$ transmitted over the network can be expressed as a unique linear
combination of data packets, say, $P_e = \sum_{i=1}^{k} g_{e,i} u_i$. The coefficient vector $g_e = (g_{e,1}, \ldots, g_{e,k}) \in \mathbb{F}_q^k$ is
called the (global) coding vector of $P_e$.

Let $x_1, \ldots, x_N$ denote the outgoing packets of the source node, and let $y_1, \ldots, y_n$ denote the incoming
packets of some destination node. Due to the linearity of the network code, these packets can be related
by

$$Y = AX = ABU \qquad (1)$$

where $U \in \mathbb{F}_q^{k \times m}$, $X \in \mathbb{F}_q^{N \times m}$ and $Y \in \mathbb{F}_q^{n \times m}$ are matrices whose rows are the packets $u_i$, $x_i$ and $y_i$,
respectively, and $A \in \mathbb{F}_q^{n \times N}$ and $B \in \mathbb{F}_q^{N \times k}$. The matrix $AB$ is called the transfer matrix of the network.






Note that successful decoding is possible if and only if rank AB = k. In this case, the network code 
is said to be feasible. Let k* denote the maximum rank of A among all choices of the network code. 
Clearly, a feasible network code exists only if $k \le k^*$, a condition we assume hereafter.

In random linear network coding [4], nodes choose the coefficients of the linear combinations uniformly 
at random from $\mathbb{F}_q$ and independently from each other. As shown in [4], a random network code is feasible
with high probability if the field size q is sufficiently large. 

In order for the destination node to be able to undo the multiplication by $AB$ (which is unknown a
priori) and recover $U$, the usual approach is to record the transfer matrix as part of the matrix $Y$ through
the use of packet headers; more precisely, the left portion of $U$ is assumed to be a $k \times k$ identity matrix.
Note that this leaves space for only $m' = m - k$ data symbols in each data packet, i.e., the effective
throughput is scaled by $m'/m$. In practice, one must choose $m' \gg k$.

Decoding corresponds to applying Gauss-Jordan elimination on $Y$ to convert it to reduced row echelon
form. Note that only $k$ linearly independent rows of $Y$ are effectively needed. Performing Gauss-Jordan
elimination on a $k \times (k + m')$ matrix requires $k^2 m' + \frac{1}{2} k^2 (k - 1)$ multiplications and a similar number
of additions¹. We will ignore the number of additions since the time to perform an addition is usually
negligible compared to the time to perform a multiplication. We also ignore the second term in the
operation count since, as discussed above, $m' \gg k$ in any realistic scheme. Thus we may say that the
decoding complexity of random linear network coding is $k$ operations per data symbol.
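
As a rough illustration of this decoding step, the following Python sketch performs Gauss-Jordan elimination over a prime field. The function name `gauss_jordan_mod_p`, the restriction to a prime field size (so that field arithmetic reduces to integer arithmetic modulo $p$), and the toy parameters are assumptions made for the example; the scheme itself works over any finite field. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.

```python
def gauss_jordan_mod_p(rows, p):
    """Reduce row vectors over GF(p), p prime, to reduced row echelon form.

    Returns the nonzero rows of the reduced matrix.
    """
    mat = [[x % p for x in row] for row in rows]
    n_rows = len(mat)
    n_cols = len(mat[0]) if mat else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, n_rows) if mat[r][col]), None)
        if src is None:
            continue
        mat[pivot_row], mat[src] = mat[src], mat[pivot_row]
        inv = pow(mat[pivot_row][col], -1, p)          # inverse of the pivot
        mat[pivot_row] = [(x * inv) % p for x in mat[pivot_row]]
        # Clear the column in every other row (above and below the pivot).
        for r in range(n_rows):
            if r != pivot_row and mat[r][col]:
                f = mat[r][col]
                mat[r] = [(a - f * b) % p for a, b in zip(mat[r], mat[pivot_row])]
        pivot_row += 1
    return [row for row in mat if any(row)]


# Toy example: k = 2 data packets, m' = 2 payload symbols, field size 7.
U = [[1, 0, 3, 4], [0, 1, 5, 6]]        # identity header followed by payload
T = [[2, 3], [1, 1], [3, 4]]            # hypothetical transfer matrix of rank 2
Y = [[sum(t * u for t, u in zip(row, col)) % 7 for col in zip(*U)] for row in T]
decoded = gauss_jordan_mod_p(Y, 7)      # recovers the rows of U
```

Because the header of $U$ is the identity, $U$ is already in reduced row echelon form, so reducing any full-rank set of received packets returns the data packets directly.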

Due to the fact that the transfer matrix AB is dense, this scheme is also called dense network coding. 

B. Sparse Network Coding with Disjoint Classes 

For large k, dense network coding is computationally too expensive in practice. A way to alleviate 
this problem is to ensure that AB has a sparse structure. The main difficulty is that this constraint must 
be not only imposed at the source node, but also coordinated among all the internal nodes — which must 
still be able to perform network coding. 

An approach proposed in [5] is to divide packets into disjoint classes (or generations [5], groups [7], 
segments [10], chunks [16]). Suppose that $k = Ld$. For $i = 1, \ldots, k$, let us say that a packet $u_i$ belongs
to class $\ell$ if $i \in \{(\ell - 1)d + 1, \ldots, \ell d\}$. Now, the rule that is enforced at each network node is that only
packets of the same class are allowed to be combined, producing a new packet of the same class. Under
this constraint, expression (1) can be rewritten as

$$Y^{(\ell)} = A^{(\ell)} B^{(\ell)} U^{(\ell)}, \quad \ell = 1, \ldots, L$$

where

$$U^{(\ell)} = \begin{bmatrix} u_{(\ell-1)d+1} \\ \vdots \\ u_{\ell d} \end{bmatrix}, \quad Y^{(\ell)} = \begin{bmatrix} y^{(\ell)}_1 \\ \vdots \\ y^{(\ell)}_{n_\ell} \end{bmatrix}$$

and where $y^{(\ell)}_j$, $j = 1, \ldots, n_\ell$, $\ell = 1, \ldots, L$, are the received packets. Note that this is essentially splitting
the network into $L$ parallel smaller networks. Due to the block-diagonal structure of $AB$, decoding can
now be performed in $\frac{1}{km} L d^2 m = d$ operations per symbol, which may be a dramatic improvement if $L$
is large.

¹ Note that asymptotically fast methods are only useful for very large parameters (much larger than those considered in this
paper).

Increasing L also reduces the overhead in transmitting packet headers. Rather than k symbols per 
packet, the overhead is now only $\lceil \log_q L \rceil + d$ symbols per packet, corresponding to a class index plus
a coding vector. 
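
To make the header saving concrete, the short Python sketch below evaluates $\lceil \log_q L \rceil + d$ for given parameters. The function name `header_symbols` and the example values of $k$, $d$ and $q$ are hypothetical and chosen only for illustration; the logarithm is computed with integer arithmetic to avoid floating-point rounding.

```python
def header_symbols(k, d, q):
    """Header length in symbols of F_q: class index plus coding vector.

    Assumes d divides k, so that there are L = k / d disjoint classes.
    """
    num_classes = k // d
    index_symbols = 0
    capacity = 1                       # number of class indices representable so far
    while capacity < num_classes:      # smallest e with q**e >= L, i.e. ceil(log_q L)
        capacity *= q
        index_symbols += 1
    return index_symbols + d


dense = header_symbols(1000, 1000, 256)    # single class: the header is k symbols
chunked = header_symbols(1000, 25, 256)    # 40 classes of 25 packets each
```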

The performance of this scheme, however, reduces as L increases. This is mainly due to the following 
reasons. First, separating flows into disjoint classes reduces the diversity of source-destination paths, 
which may reduce the min-cut of the network (and therefore $k^*$). Second, the fact that fewer packets are
combined together within each class may increase the probability of linear dependency among received 
packets. Third, differently from the L = 1 case, nodes have to choose the class from which to produce a 
new packet at each transmission opportunity. This implies that the induced network topology is chosen by 
the nodes on-the-fly, and poor choices may lead to a poor overall system. Fourth, the decoding condition 
is "$L$ times more constrained:" decoding is successful if and only if $\operatorname{rank} A^{(\ell)} B^{(\ell)} = d$ for all $\ell = 1, \ldots, L$.

The first and second problems are mitigated if k and q, respectively, are sufficiently large. For the third 
problem, different strategies have been proposed, most of which require exchange of control messages. 
We will focus here on the strategy proposed in [16], which eliminates any need for feedback: nodes 
simply choose classes uniformly at random among previously received classes. This scheme is referred 
to as chunked coding. The drawback of this approach is that it exacerbates the fourth problem. A node 
may unnecessarily receive packets from a class that has already been fully decoded, while other classes 
are still incomplete; this in turn requires n to be much larger than k. The results in [16] show that the 
overhead $(n - k)/k$ can be made comparatively small by choosing $d = \ln^2 k$ and letting $k$ be sufficiently
large. In practice, however, such a large d defeats the purpose of sparse network coding, since the decoding 
complexity becomes prohibitively large. 
The bottom line for this approach of dividing packets into disjoint classes is that it simply postpones
the scheduling problem: now classes have to be routed, rather than individual packets. Thus, if L is large, 
the same criticisms for any routing (non-network-coding) approach also apply here. 

III. Sparse Network Coding with Overlapping Classes 

In this section we present a novel scheme that attempts to overcome the drawbacks of chunked coding. 
From one perspective, the scheme can be seen as a fountain code that is fully compatible with network 
coding. 

In the following, the term class refers to a non-empty subset of $\{1, \ldots, k\}$. A class-based scheme
for network coding is specified by a set of classes, $\mathcal{C} = \{C_1, \ldots, C_L\}$, and a probability distribution
on classes, $\{p_1, \ldots, p_L\}$. When $\mathcal{C}$ is understood, we may write class $\ell$ as a shorthand for class $C_\ell$. Let
$\operatorname{supp}(g)$ be the support of a vector $g \in \mathbb{F}_q^k$, i.e., $\operatorname{supp}(g) = \{i \in \{1, \ldots, k\} : g_i \ne 0\}$. For a packet $x \in \mathbb{F}_q^m$
with coding vector $g \in \mathbb{F}_q^k$, we say that $x$ belongs to class $\ell$ if $\operatorname{supp}(g) \subseteq C_\ell$. Let $\Lambda(x)$ denote the set of
indices of all the classes to which a packet $x$ belongs, i.e., $\Lambda(x) = \{\ell \in \{1, \ldots, L\} : \operatorname{supp}(g) \subseteq C_\ell\}$. With
a slight abuse of terminology, we will usually refer to a class $\ell$ to mean all the data packets belonging
to that class.

Note that, in general, a packet $u_i$ may belong to multiple classes; for instance, we might have $C_1 \cap C_2 =
\{i\}$, which implies that $\{1, 2\} \subseteq \Lambda(u_i)$. When two classes have non-empty intersection, we will say that
these classes overlap.

Given a class-based network coding scheme $(\mathcal{C}, \{p_\ell\})$, every node in the network (including the source
node) performs, at each transmission opportunity, the following encoding procedure. First, a class index
$\ell$ is randomly selected according to $\{p_\ell\}$. If no packets from that class have yet been received, then the
process is repeated until an index $\ell$ is selected such that some packet from class $\ell$ has been received.
Then, an outgoing packet is computed as a random linear combination of received packets from class $\ell$.

Let $d_\ell = |C_\ell|$, for $\ell = 1, \ldots, L$. It should be clear that the chunked coding scheme described in
Section II-B corresponds to the special case where $\mathcal{C}$ is a partition of $\{1, \ldots, k\}$, with $d_\ell = d = k/L$,
and $\{p_\ell\}$ is uniform. In general, due to the presence of overlapping classes, we may have $\sum_{\ell=1}^{L} d_\ell > k$.
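
The encoding procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field is taken to be a prime field of size `q` so that arithmetic is done modulo `q`, a packet is represented as a pair (coding vector, payload), and the names `support` and `encode` are hypothetical. The sketch also assumes that at least one class with positive probability has a received packet; otherwise the selection loop would not terminate.

```python
import random


def support(coding_vector):
    """Indices (1-based, as in the text) of the nonzero coefficients."""
    return {i + 1 for i, g in enumerate(coding_vector) if g != 0}


def encode(received, classes, probs, q, rng):
    """Produce one outgoing packet at a node.

    received: list of (coding_vector, payload) pairs with entries in 0..q-1.
    classes:  list of sets of packet indices; probs: class probabilities.
    Returns (ell, packet), where ell is the 0-based index of the chosen class.
    """
    # A packet belongs to a class when its coding-vector support lies inside it.
    members = [[p for p in received if support(p[0]) <= c] for c in classes]
    if not any(members):
        raise ValueError("no received packet belongs to any class")
    # Redraw the class index until a class with received packets is selected.
    while True:
        ell = rng.choices(range(len(classes)), weights=probs)[0]
        if members[ell]:
            break
    # Random linear combination of the received packets of the chosen class.
    coeffs = [rng.randrange(q) for _ in members[ell]]
    k = len(members[ell][0][0])
    m = len(members[ell][0][1])
    g = [sum(a * p[0][j] for a, p in zip(coeffs, members[ell])) % q for j in range(k)]
    payload = [sum(a * p[1][j] for a, p in zip(coeffs, members[ell])) % q for j in range(m)]
    return ell, (g, payload)
```

Since combining packets whose supports lie in $C_\ell$ yields a support that still lies in $C_\ell$, the outgoing packet belongs to the selected class, which is what allows intermediate nodes to re-encode without breaking the class structure.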

Let us now describe the decoding process. For $\ell = 1, \ldots, L$, let $Y^{(\ell)}$ consist of the received packets
from class $\ell$, and let $r_\ell = \operatorname{rank} Y^{(\ell)}$. We view $Y^{(\ell)}$, and therefore $r_\ell$, as variables that are constantly
updated as new packets are received; in particular, we call the tuple $(r_1, \ldots, r_L)$ the state of the receiver.
In the context of a decoding process, we say that a class $\ell$ is decodable if $r_\ell \ge d_\ell$ and that it has
been decoded if all the data packets $u_i$ belonging to $\ell$ have been recovered. Decoding starts from some
decodable class $\ell$ that has not yet been decoded. This class is decoded by Gaussian elimination. Then,
similarly to the decoding of fountain codes, any data packets $u_i$ belonging to $C_\ell$ are back-substituted into
any overlapping classes, and the ranks $r_1, \ldots, r_L$ are recomputed. For instance, if $u_i$ belongs to classes
1 and 2, and class 1 is decoded, then we may imagine that a new packet $y^{(2)}_{n_2+1} = u_i$ has been received.
Unless class 2 has already been decoded, this has the effect of increasing $r_2$ by one unit. The process
is then repeated until all classes have been decoded, which is to say that all data packets $u_i$ have been
obtained.

[Fig. 1, panel (b): Grid code. Caption ends: "of a 2 × 2 grid code."]
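
The decoding process described above can be traced at the level of receiver states with a short Python sketch. It tracks only which packets are known, not the field arithmetic, and it assumes that every back-substituted packet is innovative for the classes it enters, so that a class becomes decodable as soon as its rank is at least its number of still-unknown packets. The function name `peel` and the 2 × 2 class layout used in the example are assumptions made for the illustration.

```python
def peel(classes, ranks):
    """Return the set of data packets recoverable from a receiver state.

    classes: list of sets of packet indices.
    ranks:   ranks[l] is the rank of the packets received from class l,
             before any back substitution.
    """
    known = set()
    progress = True
    while progress:
        progress = False
        for cls, r in zip(classes, ranks):
            remaining = cls - known
            # r independent equations suffice once at most r unknowns remain.
            if remaining and r >= len(remaining):
                known |= cls           # Gaussian elimination on this class
                progress = True        # newly known packets may unlock other classes
    return known


# Two disjoint classes plus two classes crossing them (a 2 x 2 layout).
grid = [{1, 2}, {3, 4}, {1, 3}, {2, 4}]
recovered = peel(grid, (2, 1, 1, 0))    # class 1 first, then class 3, then class 2
stuck = peel(grid, (2, 0, 2, 0))        # packet 4 is never reached
```

In the first call, decoding class 1 reduces the unknowns of class 3 to one, which in turn reduces the unknowns of class 2 to one, so all four packets are recovered from four received packets.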

The essence of the decoding process is similar to solving a crossword puzzle: when a word is "decoded," 
the recovered letters can be reused to help in the decoding of any overlapping words. Indeed, the idea 
of a crossword puzzle gives the basis for the simplest nontrivial overlapping scheme, which we call grid 
codes. A simple example of a grid code is given in Fig. 1b. A general definition will be presented in
Section IV.

Example 1: Let $k = 4$. The 2 × 2 grid code of Fig. 1b can be seen as the chunked code $\{C_1, C_2\}$ of
Fig. 1a with two extra classes $C_3$ and $C_4$. Let us assume that, in either case, packets from each class
are received with equal probability and all received packets are innovative. Suppose that, initially, two
packets from $C_1$ have been received, i.e., $r_1 = 2$, so that the decoder is in state (2, 0, 0, 0). For the
chunked code to succeed with no overhead, it is necessary that the next two received packets belong to
$C_2$, an event that happens with probability 1/4.

On the other hand, for the grid code to succeed, there is much more flexibility in the possible
received packets; more precisely, all the receiver states (2, 2, 0, 0), (2, 1, 1, 0), (2, 1, 0, 1) and (2, 0, 1, 1)
are decodable. For instance, suppose that the next two received packets belong to $C_2$ and $C_3$, i.e., the
receiver state is (2, 1, 1, 0). Decoding proceeds as follows. First, class 1 is decoded using Gaussian
elimination, which yields uncoded packets $u_1$ and $u_2$. Since $u_1$ and $u_2$ are also from classes 3 and 4,
respectively, the state is updated to (2, 1, 2, 1). Now class 3 can be decoded, uncovering packet $u_3$. Since
$u_3$ also belongs to class 2, the state becomes (2, 2, 2, 1). Now class 2 is decoded, which finally reveals
the last packet $u_4$, completing the decoding. Thus, if the initial state is (2, 0, 0, 0) and two more packets
are received, the grid code succeeds with probability 4/10 > 1/4.

Let us now examine the issue of decoding complexity. We first describe an alternative way to view
the decoding process. Note that, for each new packet $u_i$ that is recovered, one variable is effectively
removed from the problem for all the remaining classes. Thus, rather than increasing $r_1, \ldots, r_L$ at each
decoding iteration, we can equivalently decrease $d_1, \ldots, d_L$. This has precisely the same effect in the
decoding condition $r_\ell \ge d_\ell$. More precisely, let $d_\ell^{(i)}$ denote the size of class $\ell$ (in terms of remaining
variables) after the $i$th decoding iteration. Initially, $d_\ell^{(0)} = d_\ell$, for all $\ell$. After the $i$th iteration, when, say,
class $\ell^*$ is decoded, we update $d_\ell^{(i)} = d_\ell^{(i-1)} - |C_\ell \cap C_{\ell^*}|$, for all classes that have not yet been decoded.
We keep $d_\ell^{(i)} = d_\ell^{(i-1)}$ for the decoded classes, since this tells us precisely the size of the problem that
was solved for class $\ell$, i.e., how many packets had to be decoded by Gaussian elimination. Thus, at the
end of the decoding process, say, after iteration $t$, we should have $\sum_{\ell=1}^{L} d_\ell^{(t)} = k$, which is precisely the
total number of variables. Using this description of the decoding process, we can provide the following
bound on the decoding complexity.

Theorem 1: Let $d_{\ell_1}, \ldots, d_{\ell_L}$ denote the sizes of all classes sorted in decreasing order. The worst-case
decoding complexity, in operations per symbol, is upper bounded by

$$\frac{1}{k} \left( \sum_{i=1}^{t-1} d_{\ell_i}^2 + d_{\ell_t} \Big( k - \sum_{i=1}^{t-1} d_{\ell_i} \Big) \right) \le d_{\ell_1}$$

where $t$ is the smallest integer such that $\sum_{i=1}^{t} d_{\ell_i} \ge k$.

Proof: Without loss of generality, suppose that classes are sorted according to the order in which
they are decoded, i.e., class 1 is decoded first, then class 2, and so on. Let $t$ be the number of iterations
after which decoding is complete. Class 1 is decoded first, after which $d_2 - d_2^{(1)}$ uncoded packets are
forwarded to class 2. By examining the matrix of the linear system that has to be solved for class 2, it
is easy to see that this system can be solved with precisely $d_2^{(1)} d_2 m = d_2^{(t)} d_2 m$ operations. In general,
each class $i$ can be decoded with $d_i^{(t)} d_i m$ operations, giving a total complexity of

$$\frac{1}{km} \sum_{i=1}^{t} d_i^{(t)} d_i m = \frac{1}{k} \sum_{i=1}^{t} d_i^{(t)} d_i$$

operations per symbol.

To obtain a bound, we need to maximize the function $\sum_{\ell=1}^{L} d_\ell x_\ell$, subject to the constraints $0 \le x_\ell \le d_\ell$,
$\ell = 1, \ldots, L$, and $\sum_{\ell=1}^{L} x_\ell = k$. It is clear that this function is maximized by choosing $x_{\ell_i} = d_{\ell_i}$,
$i = 1, \ldots, t - 1$, and $x_{\ell_t} = k - \sum_{i=1}^{t-1} d_{\ell_i}$, where $t$ is the smallest integer such that $\sum_{i=1}^{t} d_{\ell_i} \ge k$. Thus,
we obtain

$$\frac{1}{k} \sum_{i=1}^{t} d_i^{(t)} d_i \le \frac{1}{k} \left( \sum_{i=1}^{t-1} d_{\ell_i}^2 + d_{\ell_t} \Big( k - \sum_{i=1}^{t-1} d_{\ell_i} \Big) \right)$$

with equality if $\ell_i = i$, $i = 1, \ldots, t$, and $C_1, \ldots, C_t$ are disjoint.

The second expression follows from $d_{\ell_i} \le d_{\ell_1}$ and $\sum_{\ell=1}^{L} x_\ell = k$.

Theorem 1 shows that the complexity is dominated by the largest $t$ classes and is not increased by
adding any number of classes that are smaller than the largest $t$ classes. In particular, for a code with
fixed-size classes, the complexity is never greater than that of the corresponding chunked code. For
general codes, we should in fact expect a complexity much smaller than the bound of Theorem 1. This
is because that bound is achieved when the first t classes to be decoded are the largest ones and are 
disjoint, while in practice we would expect smaller classes to be decoded first and be back-substituted 
into larger ones. 
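
As a numerical check of this discussion, the Python sketch below evaluates the maximization used in the proof of Theorem 1: the classes are sorted by decreasing size, the largest ones are counted in full until $k$ variables are covered, and the last class contributes only the remaining variables. The function name `complexity_bound` and the example class sizes are hypothetical, and the formula is taken as reconstructed from the proof, so the sketch should be read as an illustration rather than a reference implementation.

```python
from fractions import Fraction


def complexity_bound(sizes, k):
    """Upper bound on operations per symbol, returned as an exact fraction.

    sizes: sizes of all classes; k: number of data packets.
    """
    ordered = sorted(sizes, reverse=True)   # largest class first
    covered = 0      # variables covered by the classes counted in full so far
    weighted = 0     # sum of the squared sizes of those classes
    for d in ordered:
        if covered + d >= k:                # this class completes the k variables
            return Fraction(weighted + d * (k - covered), k)
        covered += d
        weighted += d * d
    raise ValueError("class sizes sum to less than k")


chunked = complexity_bound([25] * 40, 1000)              # fixed-size disjoint classes
with_extra = complexity_bound([25] * 40 + [10] * 5, 1000)  # extra smaller classes added
mixed = complexity_bound([3, 2, 2], 4)                   # one larger class
```

With fixed-size classes the bound equals the class size, and appending smaller classes leaves it unchanged, in line with the remarks above.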

Evaluating the performance is a much harder issue. This is due to the fact that Gaussian elimination 
is combined with back substitution in a recurring manner, leading to an extremely intricate decoding 
process. Nevertheless, for simple cases, we can compute the performance exactly. Fig. 2 shows the exact
probability of successful decoding versus overhead for the 2 × 2 grid code of Fig. 1b. It can be seen that,
for the same complexity, the performance of this grid code is uniformly better than that of the corresponding
chunked code. 

IV. Examples of Codes 

In this section, we present some examples of codes with overlapping classes. The performance of these 
codes will be investigated in Section V.

Definition 1: Let $k = dd'$. A $d' \times d$ (rectangular) grid code $\mathcal{C} = \{C_1, \ldots, C_L\}$ consists of $L = d + d'$
classes given by

$$C_i = \{(i - 1)d + j \mid j = 1, \ldots, d\}, \quad i = 1, \ldots, d'$$
$$C_{d'+j} = \{(i - 1)d + j \mid i = 1, \ldots, d'\}, \quad j = 1, \ldots, d.$$
Fig. 2: Comparison between a 2 × 2 grid code and a chunked code with $d = 2$ and $k = 4$. (Plot title: "2 × 2 case, Chunk vs Grid"; horizontal axis: overhead $n - k$, from 0 to 10.)



Note that, in a $d' \times d$ grid code, the first $d'$ classes have size $d$ and form a partition of $\{1, \ldots, k\}$,
while the last $d$ classes have size $d'$ and also form a partition of $\{1, \ldots, k\}$.
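
Definition 1 translates directly into a few lines of Python. The sketch below is illustrative only; the function name `grid_code` and the parameter name `d_rows` (standing for $d'$) are assumptions, and packet indices are 1-based as in the definition.

```python
def grid_code(d_rows, d):
    """Classes of a d' x d grid code, with d_rows playing the role of d'.

    Packets 1..d*d_rows are laid out row by row on a grid with d_rows rows
    and d columns; each row and each column forms one class.
    """
    rows = [{(i - 1) * d + j for j in range(1, d + 1)} for i in range(1, d_rows + 1)]
    cols = [{(i - 1) * d + j for i in range(1, d_rows + 1)} for j in range(1, d + 1)]
    return rows + cols


square = grid_code(2, 2)     # the 2 x 2 case: two rows followed by two columns
rect = grid_code(3, 4)       # 3 classes of size 4 followed by 4 classes of size 3
```

Every row class meets every column class in exactly one packet, which is the crossword-like structure exploited by the decoder.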

When all classes have the same size, i.e., d' = d, we obtain square grid codes. These codes are, 
unfortunately, too restrictive, since we must have $k = d^2$. A way to span a higher number of packets
with fixed-size classes is provided by diagonal grid codes. For convenience, in the next definitions, 
assume that packet and class indices are numbered starting at zero. 

Definition 2: Let $k = L_0 d$ and assume $L \le (L_0)^2$. A $(k, d, L)$ diagonal grid code with angle set
$\Theta = \{\theta_0, \ldots, \theta_{\lceil L/L_0 \rceil - 1}\}$ consists of $L$ classes given by

$$C_\ell = \{(i + j\theta_s)d + j \bmod k \mid j = 0, \ldots, d - 1\}, \quad i = \ell \bmod L_0, \quad s = \lfloor \ell / L_0 \rfloor, \quad \ell = 0, \ldots, L - 1.$$

A diagonal grid code with angle $\theta$ is a diagonal grid code with angle set $\Theta = \{0, \theta, 2\theta, \ldots\}$.

For $s = 0, \ldots, \lceil L/L_0 \rceil - 1$, the classes $C_{sL_0}, \ldots, C_{(s+1)L_0 - 1}$ form a partition of $\{0, \ldots, k - 1\}$. In
particular, the $L_0$ classes with angle $\theta_s = 0$ correspond to a chunked code. An example of a diagonal
grid code is given in Fig. 3.
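
The construction of Definition 2 can be sketched in Python as follows. The function name `diagonal_grid_code` is hypothetical, and the angle set used in the example is an assumption chosen for illustration; it is not necessarily the one used in the figure.

```python
def diagonal_grid_code(k, d, L, angles):
    """Classes C_0, ..., C_{L-1} of a (k, d, L) diagonal grid code.

    angles[s] is the angle of layer s; packet indices are 0-based.
    """
    if k % d:
        raise ValueError("d must divide k")
    L0 = k // d                              # number of classes per layer
    classes = []
    for ell in range(L):
        i, s = ell % L0, ell // L0           # position within the layer, layer index
        classes.append({((i + j * angles[s]) * d + j) % k for j in range(d)})
    return classes


# A (15, 3, 10) code with two layers: angle 0 (plain chunks) and angle 1.
code = diagonal_grid_code(15, 3, 10, angles=[0, 1])
layer0, layer1 = code[:5], code[5:]
```

Viewing the packets as an $L_0 \times d$ array filled row by row, a class with angle $\theta_s$ takes one packet from each column, moving down $\theta_s$ rows (cyclically) from one column to the next.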



Fig. 3: Example of a (15, 3, 10) diagonal grid code. (Legend: every marker is a distinct input packet.)



The design of a diagonal grid code minimizes the maximum size of the intersection of two classes. It
is easy to see that, if all the nonzero $\theta_s$ are relatively prime to $L_0$, then any two distinct classes overlap
in at least $\lfloor L/L_0 \rfloor - 1$ and at most $\lceil L/L_0 \rceil - 1$ classes. To see that this value is optimal, consider a
bipartite graph with packets and classes as nodes, and an edge connecting a packet to a class if that
packet belongs to that class. It follows that the maximum degree of a packet must be at least the average
degree $(Ld)/(L_0 d) = L/L_0$.

Note that a diagonal grid code consists essentially of multiple layers of chunked codes each applied after
the packets $0, \ldots, k - 1$ undergo a certain (grid-like) permutation. Thus, the construction of Definition 2
can be generalized by using arbitrary permutations. For $s = 0, \ldots, \lceil L/L_0 \rceil - 1$, let $\pi_s$ be a permutation
of $\{0, \ldots, k - 1\}$. Then we may consider a code with $L$ classes of size $d$ given by

$$C_\ell = \{\pi_s(id + j) \mid j = 0, \ldots, d - 1\}, \quad i = \ell \bmod L_0, \quad s = \lfloor \ell / L_0 \rfloor, \quad \ell = 0, \ldots, L - 1$$

where $L_0 = k/d$. Without loss of generality, we will assume that $\pi_0$ is the identity permutation. If all the
remaining permutations are chosen uniformly at random, we will call the resulting code a random-layer
code.
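
A random-layer code as just described can be generated with the following Python sketch. The function name `random_layer_code` and the use of Python's `random.Random.shuffle` to draw the permutations are choices made for this illustration.

```python
import random


def random_layer_code(k, d, L, rng):
    """Classes of a random-layer code with L classes of size d (0-based indices)."""
    if k % d:
        raise ValueError("d must divide k")
    L0 = k // d
    num_layers = -(-L // L0)                 # ceil(L / L0)
    perms = [list(range(k))]                 # the first permutation is the identity
    for _ in range(num_layers - 1):
        perm = list(range(k))
        rng.shuffle(perm)                    # uniformly random permutation
        perms.append(perm)
    classes = []
    for ell in range(L):
        i, s = ell % L0, ell // L0
        classes.append({perms[s][i * d + j] for j in range(d)})
    return classes


code = random_layer_code(12, 3, 10, random.Random(0))
```

The first $L_0$ classes are the plain chunks, each further complete layer is again a partition of the packets, and the last layer may be incomplete when $L$ is not a multiple of $L_0$.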

For generality, in all the codes described above, we have left the probability distribution $\{p_\ell\}$
unspecified. However, in the case that all classes have a constant size $d$, it is quite natural to use a uniform
distribution $p_\ell = 1/L$ for all $\ell$. More generally, we see no reason to assign different probabilities for
classes of the same size, and we will use this assumption in all the experiments in the next section.

V. Performance Evaluation 

In this section, we use simulations to evaluate the performance of the codes described in the previous 
section. 

We make the following assumptions: 
1) All received packets are linearly independent whenever possible, i.e., $\operatorname{rank} Y^{(\ell)} = \min\{n_\ell, d_\ell\}$,
for all $\ell$.

2) The probability that a received packet belongs to class $\ell$ is exactly equal to $p_\ell$, for all $\ell$.

Note that the two assumptions above concern themselves with the network topology and the network code,
and they are required if we wish to pursue an analysis that is independent of the network. Assumption 1
implies that the source node must generate a sufficient number of packets from each class ($N_\ell \ge d_\ell$) and
that both the encoding at the source node and the network code must not introduce any linear dependence
on any set of up to $d_\ell$ received packets. Assumption 2 means that the network preserves the designed
probability distribution on classes. Both assumptions should hold true if $q$ and each $d_\ell$ are sufficiently
large. In order to satisfy this requirement, we assume that a parameter $d_{\min}$ is given such that any valid
code must satisfy $d_\ell \ge d_{\min}$ for all $\ell$. Specifically, we consider $d_{\min} = 25$ in the following results. Note
that the value of $q$ does not affect code design.

Performance is evaluated in terms of the complexity-overhead tradeoff. Since the problem is inherently 
delay-tolerant — each receiver is interested in receiving the complete file with probability 1, no matter 
how long it takes — the two main figures of merit are the expected complexity and the expected overhead. 
Note that the figure of expected overhead automatically incorporates the probability of failure for each 
specific overhead, therefore eliminating the need to consider a three-dimensional tradeoff space. 
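
Under Assumptions 1 and 2, a receiver can be simulated without any field arithmetic, since only the number of packets received from each class matters. The Python sketch below is one possible way to estimate the expected overhead in this setting; it is not the simulator used for the reported results. The function names are hypothetical, the overhead is measured as $n - k$ received packets, and the classes are assumed to cover all $k$ packets (otherwise the loop would not terminate).

```python
import random


def is_decodable(classes, counts, k):
    """Check whether all k packets can be recovered, given packet counts per class."""
    known = set()
    progress = True
    while progress:
        progress = False
        for cls, n in zip(classes, counts):
            remaining = cls - known
            rank = min(n, len(cls))                  # Assumption 1
            if remaining and rank >= len(remaining):
                known |= cls                         # decode and back-substitute
                progress = True
    return len(known) == k


def simulate_overhead(classes, probs, k, rng):
    """Number of received packets beyond k until decoding first succeeds."""
    counts = [0] * len(classes)
    n = 0
    while not is_decodable(classes, counts, k):
        ell = rng.choices(range(len(classes)), weights=probs)[0]   # Assumption 2
        counts[ell] += 1
        n += 1
    return n - k


def expected_overhead(classes, probs, k, trials, seed=0):
    """Monte Carlo estimate of the expected overhead n - k."""
    rng = random.Random(seed)
    return sum(simulate_overhead(classes, probs, k, rng) for _ in range(trials)) / trials
```

For a dense code, that is, a single class containing all $k$ packets, this procedure always returns zero overhead, which matches the optimal-overhead extreme of the tradeoff; for other codes, decodability is rechecked from scratch after every packet, which is simple but not efficient for large $k$.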

Fig. 4 shows how complexity is traded off against overhead in a chunked code. At one extreme, we
have a dense code with a single class of size $d = k$; this code has optimal overhead but prohibitively
large complexity. At the other extreme we have a chunked code with class size $d = d_{\min}$, which attains
the minimum possible complexity at the expense of a large overhead. As shown in Fig. 4, for small to
moderate complexity, diagonal grid codes can outperform chunked codes by a large margin. Note that
the complexity of diagonal grid codes is precisely equal to the class size $d$. The number of classes $L$ for
each grid code has been tuned experimentally to maximize the performance for the given parameters.
From left to right, the points in Fig. 4 correspond to $L$ = 28, 12, 9, 2.

Fig. 5 shows similar results for a scenario where $k = 4096$. As one can see, well-designed grid codes
significantly outperform chunked codes. From left to right, the grid codes in the figure have $L$ = 207,
92, 43. Fig. 5 also shows the performance of codes with varying class sizes, referred to as mixed codes.
From left to right, these codes are: a (4096, 32, 200) diagonal grid code with an additional random
class of size 2048; a (4096, 64, 86) diagonal grid code with an additional random class of size 1024;
and a (4096, 128, 38) diagonal grid code with an additional random class of size 512. In all cases, the
distribution $\{p_\ell\}$ used is the uniform one. In comparison with their corresponding grid codes, the mixed
codes exhibit a significantly lower overhead with only a marginal increase in complexity. As discussed
in Section III, this is due to the fact that the extra (large) class is typically decoded only after many other
(smaller) classes have been decoded and back-substituted. The effect of a large class is analogous to that
of a high degree check in LT codes: establishing a "bridge" between non-overlapping classes and thus
allowing the decoding "ripple" [17] to be maintained for a longer time.

Fig. 4: Performance of chunked codes and diagonal grid codes for $k = 1000$. (Axis label: expected complexity, in operations per symbol.)

Our results show that, for a fixed expected complexity, the use of overlapping classes can reduce the 
expected overhead by up to 70%. 

VI. Concluding Remarks 

This paper presents a novel approach to network coding based on the concept of overlapping classes. 
The approach generalizes chunked coding and allows a propagative decoder that enjoys many of the 
benefits of fountain codes. Our proposed scheme, while still in its initial stages, seems to be a promising 
step towards a full network coding solution to peer-to-peer file distribution. More generally, our approach 
seems to be suitable for any application that would benefit from a combination of fountain coding and 
network coding. 
Fig. 5: Performance of chunked codes, diagonal grid codes and mixed codes for $k = 4096$.

We remark that, while our analysis assumes no feedback between nodes, nothing prevents a protocol 
based on our scheme from using control messages to aid the communication. By carefully designing the 
amount of protocol overhead, the overall performance of the scheme may be further increased. 

Our main objective with this paper has been to suggest a new possible direction for research in network 
coding, as more questions remain open than have been answered here (especially on the theoretical side). In
particular, the design of good codes with constant or non-constant class sizes (and possibly nonuniform 
distribution) is an important open problem. Due to the recursive nature of the decoding process, the 
development of analytical bounds on performance also remains elusive at this point. We hope to address 
both problems in our future work. 